The project is documented at: https://github.com/jannikgreif/DataSciR_2021
| Name | Course of Studies | E-Mail |
|---|---|---|
| Jannik Greif | M.Sc. Wirtschaftsinformatik | jannik.greif@st.ovgu.de |
| Kolja Günther | M.Sc. Data and Knowledge Engineering | kolja.guenther@st.ovgu.de |
| Frank Dreyer | M.Sc. Data and Knowledge Engineering | frank.dreyer@st.ovgu.de |
The present project aims to determine whether social media posts addressed to NBA players shortly before games have a significant impact on those players’ in-game performance. For this purpose, we considered NBA players that are highly active on Twitter and used the Twitter API to extract tweets addressed to them within a short period of time before games. A sentiment analysis was then applied to capture the attitude of the posts, and with the resulting sentiment polarity scores we tested whether there is a correlation between social media posts and players’ on-court performance.
At the beginning of our project we want to give a brief overview of our research goal and the tasks we want to fulfill. This project aims to answer the following research question: Can we find a significant correlation between negative/positive social media posts related to a specific NBA player and his on-court performance in the following game?
To answer this research question we worked on the following objectives:
Objective 1: Dataset Creation Acquire game statistics of NBA players that are highly active on Twitter, along with the tweets they received from peers and fans in an appropriate time window before games. The game statistics should include an appropriate metric that describes how the player performed within a corresponding game. The tweets need to be preprocessed into an appropriate format for the subsequent analysis steps. The attitude of the extracted posts should be captured by assigning a sentiment score to each of them. The sentiment scores of the tweets a player received in the corresponding time window before a game should then be aggregated and linked to the respective game. This should result in a data set in which each record contains the game statistics of a player for a specific game and the aggregated sentiment information of the tweets that were addressed to the player before that game.
Objective 2: Exploratory Data Analysis Analyze the association between the aggregated polarity scores of the tweets a player received before games and the performance of the player within the games using appropriate performance metrics. Additionally, the strength and significance of the correlation should be evaluated.
Additional objective 3: Prediction Model After investigating the results of our analysis we decided to additionally set up a prediction model to validate our findings. With this model we wanted to explore the influence of tweet sentiments on the task of predicting an NBA player’s on-court performance. For this purpose we fed the sentiments as features into the predictor.
The main challenge in the pre-processing phase of our project was to create suitable datasets. For the players’ performance variable we needed to create a set of datasets which cover all necessary statistics and metadata to be able to derive the needed values. For the sentiment variable of the tweets referring to one respective game and player, we first needed to narrow down the selection of players whose tweets we wanted to observe and then extract all tweets that are related to this set of players. How this was done is described in the following section.
To get the needed data about players, games, seasons and all relevant metadata, we extracted statistical datasets from basketball-reference.com, a site which provides historical basketball statistics of players and teams from various US American and European leagues including the NBA. From this we created local .csv files for different metrics.
To get started, we set up our environment; for extracting the data from basketball-reference.com we made extensive use of the web scraping library rvest in addition to the tidyverse.
Before extracting stats about NBA players and games, we had to check which players even have a Twitter account. Fortunately for us, basketball-reference.com provides a list of Twitter usernames of NBA players, so we loaded the account names into the player-metadata.csv, along with a unique BBRef_Player_ID, which we took over from basketball-reference.com, and the clear name of the respective players. With this set of players we now wanted to extract further statistics.
The idea behind this dataset was to create a tibble which includes all the statistics of players on season level. A basketball season is split into a regular season (comparable to our “Bundesliga” system) and a playoff season (comparable to a tournament’s knockout phase) which only the best teams of a regular season reach, so basketball-reference.com provides two separate datasets, one for each season type. To combine those datasets and map them to each player in one table, we set up loops that check for each season whether a player actively participated in the regular season and/or the playoffs, extract the statistics if the condition holds true, and tag each tuple with the respective label “Regular Season” or “Playoffs.” If a player didn’t participate in either of the two, we set all variables of this entry to NA. This step is necessary as the original data labels such entries with a respective string, like “Did not play” or “Inactive.” Finally the dataset contained one tuple of statistics for each player and season/season type he participated in, including the following metrics:
The next step was to extract performance statistics of the NBA players on game granularity. With this we wanted to create our main source for the performance indicators, which we wanted to exploit in our exploratory analysis. According to Wikipedia, Twitter was founded in 2006. Probably not many NBA players had a Twitter account at that time. In 2007 only 400,000 tweets were posted per quarter. However, the popularity of Twitter skyrocketed after its founding, with over 50 million daily tweets in 2010. Therefore we only considered players that actively played from 2010 onward. Since player performance metrics like +/- become rather unreliable if a player only gets a small amount of playing time, we only considered players that on average get at least two quarters of playing time (i.e. 24 minutes). Below you can see a list of all players that fulfill this restriction, along with their playing time:
In detail, the player-game-stats.csv contains for each player/game combination within our observation range a detailed set of statistics, like the season type and date of the game or the team for which the player started, as well as individual performance statistics for each game, like the ones we also extracted for the season statistics but on game granularity, plus extra metrics that can only be obtained for each game individually. While the already known statistics were obtained from the basic game logs, the additional data was extracted from the advanced game logs. With these advanced statistics we also added the Box Plus/Minus score, “a box score estimate of the points per 100 possessions a player contributed above a league-average player (defined as being 0.0), translated to an average team,” as described here. This metric is calculated from the box score information of a player, his position on-court, and the overall performance of the team. As an example, a score of +10.0 means that the team is 10 points per 100 possessions better with this particular player on-court than with an average player. Since the metric considers the overall performance of a player within a game (including offensive and defensive effort) we decided to use it as our main game performance indicator for the correlation analysis. The joined table of the basic and advanced game logs is represented below.
Since the tip-off time (i.e. the time when a game starts) was missing in the game logs, the last data source we wanted to create was a table of metadata for each game. For this purpose we extracted the NBA schedule and results from basketball-reference.com for each season from 2010 to 2021 column-wise and merged these columns into one tibble, which was then stored in the game-metadata.csv.
As you can see in the table below, the metadata we extracted contains for each match the date and starting time, the home team with their respective points and the visitor team with their respective points.
The Twitter datasets contain all tweets related to the players whose on-court performance we want to inspect in our analysis, and this part of the notebook contains the whole pipeline for extracting these relevant tweets. Before we were able to use the API, some preparatory work had to be done. First, we set up the main directory for our data, and the datasets we previously created from basketball-reference.com were loaded. Next, before we could start with the extraction of relevant tweets for our NBA players, we had to narrow down the number of players considered by our pipeline. The datasets above include over 227 players. As collecting tweets for such a number of players would create an excessive load, we decided to pick the top players most relevant for us, following the criteria set up below.
First of all we picked those players who played continuously in the regular seasons 2016/17 - 2018/19. We didn’t consider the playoffs here, as many players don’t reach the playoffs with their teams but still play a full regular season and therefore provide enough interesting game data for our analysis. Furthermore, we only wanted those players in our dataset who stayed at their respective team for the whole observation period. The idea behind this was to eliminate team switches as a possible factor influencing the players’ performance. Additionally, we considered only those players who had on-court time in at least 80% of the games during the regular seasons.
As a third parameter we inspected the variable “Box Plus/Minus” (BPM) in the player_game_stats dataset, a score-based performance indicator that was briefly introduced in the previous section. With this estimate, we wanted to extract those players whose performance is relatively unstable in comparison to their colleagues, by computing the standard deviation of performance for each player and sorting the players in descending order of deviation. On this dataset we applied a cutoff value to keep only those players whose standard deviation was greater than or equal to 8. The assumption behind this filter parameter was that an influence of social media on performance can only be observed when there is variation in performance over the observation period. For players with a stable performance, we would not be able to measure an impact, since there is no or just very little change.
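As a minimal sketch (with made-up BPM values), this volatility filter could look as follows in dplyr:

```r
library(dplyr)

# Toy per-game BPM values for three hypothetical players
bpm <- tibble(
  Player = rep(c("A", "B", "C"), each = 4),
  BPM    = c(-10, 12, -8, 9,    # volatile
              1,  2,  1, 2,     # stable
              -9, 10, -11, 12)  # volatile
)

# Keep only players whose BPM standard deviation is >= 8,
# sorted in descending order of deviation
unstable <- bpm %>%
  group_by(Player) %>%
  summarise(bpm_sd = sd(BPM), .groups = "drop") %>%
  filter(bpm_sd >= 8) %>%
  arrange(desc(bpm_sd))

unstable$Player  # "C", "A" -- player B is filtered out
```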
The last parameter we included in our selection concerned players with a minimum follower count of 1,000 users on Twitter. For this we joined our data with the Twitter metadata we extracted along with the tweets themselves and required the variable ‘follower_count’ to be at least 1,000. Similar to the prior filter condition, the idea behind this was to only consider players who potentially receive enough tweets to generate an impact on their performance. The assumption: players with fewer than 1,000 followers are unlikely to receive a sufficient amount of tweets for our observations. Finally we filtered our relevant_players tibble by an inner join with this selection on their common variable screen_name.
In the following you can see the list of players that fulfilled the described criteria. Of the 227 NBA players only 21 were left. In terms of data volume we decided that this was a reasonable number of players to consider.
With the given data we were now able to extract exactly those tweets we needed for our analysis. To do so, we chose to use the get_all_tweets function from the academictwitteR package.
To only extract tweets that can be assumed to be relevant for a specific game day, we delimited the time range of tweets considered for the extraction to the window between 24 hours and 45 minutes before a game (to be on the safe side, we first extracted tweets in a range of 48 hours before a game and boiled it down to 24 hours in an extra step). With the first limit we wanted to avoid that tweets related to another match get considered, as it is not unusual for a player to have two games in two days. The 45 minute delimiter was set according to the assumption that it is unlikely for players to check their Twitter account less than 45 minutes before a game (players are even forbidden from looking at their phones 15 minutes before a game). After these parameters were set, we obtained the tweets for each player of the preselection list discussed before. Along with the raw text, the datetime of creation, the retweet count, the reply count, the favorite count and the quote count were added to the dataset of each player’s tweets. A very important step for our later analysis was to map each tweet to the respective BBRef_Player_ID and BBRef_Game_ID, to be able to address tweets based on a player or game selection. Finally, each set was stored under the player’s Twitter name.
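The window computation can be sketched like this; the handle and tweet cap are made-up placeholders, and the actual `get_all_tweets` call requires an academic bearer token, so it is only indicated in comments:

```r
library(lubridate)

# Hypothetical tip-off time; the window runs from 24 h to 45 min before it
tip_off      <- ymd_hms("2019-03-15 19:30:00", tz = "UTC")
window_start <- tip_off - hours(24)
window_end   <- tip_off - minutes(45)

# Format as the ISO-8601 strings the Twitter API expects
fmt <- function(t) format(t, "%Y-%m-%dT%H:%M:%SZ")

# The actual extraction (needs a bearer token; not run here):
# library(academictwitteR)
# tweets <- get_all_tweets(
#   query        = "@StephenCurry30",   # example handle
#   start_tweets = fmt(window_start),
#   end_tweets   = fmt(window_end),
#   n            = 5000                 # illustrative cap
# )

fmt(window_start)  # "2019-03-14T19:30:00Z"
fmt(window_end)    # "2019-03-15T18:45:00Z"
```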
After we finished setting up our datasets, and before extracting sentiments from the tweets, it was reasonable to pre-process them in order to improve the accuracy of the computed sentiments, especially in our case where the text comes in the form of tweets. Tweets are often written in less formal language, including abbreviations and slang, so our focus lay in identifying and converting these phenomena into a language with increased machine readability. That is why we used the textclean package to apply the following pre-processing steps to each tweet:
A very important decision that should be noted is that stemming (i.e. Porter stemming) was not applied to the tweets, since the terms are written in their natural form in the sentiment lexica and stemming would merge words into one stem which wouldn’t appear in the lexicon. Additionally, stopword removal was not performed to avoid the risk of removing potentially crucial valence shifters for the sentiment extraction (e.g. in “I am not happy” the term “not” negates the sentiment and should therefore not be removed). These valence shifters play a significant role in the sentiment analyzer we chose for our data, and we will get back to this in a later section.
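A minimal sketch of this kind of normalization with textclean (the exact steps and their order in our pipeline may differ; the input string is made up):

```r
library(textclean)

raw <- "omg he's sooooo good lol <br>"

clean <- raw |>
  replace_html() |>            # strip HTML remnants such as <br>
  replace_word_elongation() |> # "sooooo" -> "so"
  replace_internet_slang() |>  # expand abbreviations such as "lol"
  replace_contraction()        # "he's" -> "he is"
```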
One of the first issues we were confronted with when we started looking into our extracted Twitter data and ran a sentiment analysis on it was that emojis didn’t get processed properly by any analyzer we tested. But in our opinion, no component of a tweet carries emotion as strongly as emojis (which of course is already implied by the name). So, besides the pre-processing of the textual information of the tweets themselves, we decided to handle emojis separately by running a dedicated emoji analyzer, the Novak Emoji Sentiment Lexicon, over them. Emojis were extracted from each tweet and stored by their key representation (from the Novak Emoji Sentiment Lexicon) in a separate variable, separated by white spaces, in order to use them for an encapsulated emoji sentiment computation for the individual tweets.
The following table gives an idea of what the tweets look like before and after the performed pre-processing steps:
Our next task was to extract sentiments from the pre-processed tweets and extracted emojis. To solve this task we used the package sentimentr, since compared to other solutions (e.g. syuzhet and tidytext), sentimentr uses an ordered bag-of-words model that allows it to incorporate valence shifters before or after polarized words to negate or intensify their sentiment (e.g. “I do not like it!” or “I really like it!”). That ultimately gives sentimentr the power to assign sentiments to text passages much more accurately.
The sentiments were computed sentence-wise for each tweet and aggregated via sentimentr’s average_downweighted_zero function, which downweights the sentiment scores of sentences close to zero.
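In code this amounts to something like the following (two made-up example tweets):

```r
library(sentimentr)

tweets <- c(
  "I really like this player! He was amazing tonight.",
  "I do not like how he played. Terrible defense."
)

# Sentence-wise sentiment per tweet, aggregated with
# average_downweighted_zero (downweights near-zero sentences)
scores <- sentiment_by(
  get_sentences(tweets),
  averaging.function = average_downweighted_zero
)
scores$ave_sentiment  # positive for the first tweet, negative for the second
```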
The following sentiment lexica were used to compute the sentiments for each tweet by making use of the lexicon package:
In the following you can have a glimpse into the collected sentiments of each analyzer:
After all the data was now gathered and preprocessed, we were finally ready to do our exploratory analysis and investigate the main research question we wanted to answer with this project.
In the beginning we wanted to assess if the sentiment scores the different sentiment lexica provided for tweets were actually comparable. For that purpose we computed the Spearman rank correlation coefficient between tweet sentiment scores provided by each pair of sentiment lexicons to assess whether the ranking of the tweets according to one sentiment lexicon agrees with the ranking of the tweets according to another sentiment lexicon.
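With base R this pairwise comparison boils down to `cor(..., method = "spearman")`; a toy example with two made-up score vectors whose rankings agree except for one swapped pair:

```r
# Per-tweet scores from two hypothetical lexicons
lexicon_a <- c(0.8, 0.1, -0.5, 0.3, -0.9)
lexicon_b <- c(0.9, 0.2, -0.4, 0.1, -0.7)

# Spearman correlates the ranks rather than the raw scores
cor(lexicon_a, lexicon_b, method = "spearman")  # 0.9
```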
We then plotted the results into a heatmap:
Generally we can see that, except for the Emoji Sentiment Lexicon by Novak, all sentiment lexicons seem to correlate rather well. Apparently the computed tweet sentiments from the Emoji Lexicon differ strongly from the sentiments of the other lexicons, which is reasonable since the Emoji Lexicon is only computed on the emojis contained in a tweet while the other lexicons use its textual information. The Jockers-Rinker and Syuzhet lexicons are most similar, with a Spearman correlation coefficient around 0.95. This intuitively makes sense since Jockers-Rinker is a combined version of Syuzhet and Bing, as mentioned before.
Since the sentiment scores were computed on a per-tweet basis we first had to aggregate the sentiment scores accordingly in order to capture the overall social media vibe players were receiving before games in a single number. For that purpose we considered the sentiment scores of all tweets a respective player received in a 24 hour window before a respective game and aggregated them as follows:
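A toy sketch of these per-game aggregates in dplyr (the data and the exact weighting scheme, including the +1 smoothing so that zero-retweet tweets still count, are illustrative):

```r
library(dplyr)

# Made-up per-tweet sentiments for two games of one player
tweet_sentiments <- tibble(
  BBRef_Game_ID = c("G1", "G1", "G1", "G2", "G2"),
  sentiment     = c(0.5, -0.2, 0.3, -0.6, -0.4),
  retweet_count = c(10, 0, 40, 5, 5)
)

game_aggregates <- tweet_sentiments %>%
  group_by(BBRef_Game_ID) %>%
  summarise(
    avg_sentiment      = mean(sentiment),                          # unweighted average
    weighted_sentiment = weighted.mean(sentiment, retweet_count + 1),
    neg_proportion     = mean(sentiment < 0),                      # share of negative tweets
    .groups = "drop"
  )
```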
The following table shows an excerpt of the per-game computed sentiment aggregates for the different sentiment lexica:
At this point we had all the necessary data to analyze the association between the aggregated sentiment scores of tweets the players received within 24 hours before games and their performance within the games.
Before analyzing these bivariate relationships, however, we first wanted to get a general idea of how the individual variables were distributed.
Plotting the density curves for the unweighted average sentiment scores for the different sentiment lexica and players revealed the following picture:
Looking at the individual density curves we observed that the distributions of the average sentiment scores roughly fit the bell curve of a Normal distribution despite a few exceptions (esp. for the averaged sentiments for the emoji sentiment lexicon by Novak).
To check our normality assumption we also constructed Q-Q plots for the unweighted average sentiment scores:
The Q-Q plots confirmed our assumption of normality: despite some deviations at the extremities (some observed extremes were more extreme than expected), most of the observed quantiles matched the expected quantiles of the fitted Normal distribution.
A similar picture could be observed for the average weighted sentiment scores (weighted by their associated retweet count) as the following equivalent grid of Q-Q plots shows:
The distributions of the negative tweet proportions, however, mostly did not follow a Normal distribution and were strongly right-skewed. That intuitively made sense, since in most cases players only received a small proportion of tweets with a negative sentiment, which leads to the right skewness of the distribution (also because proportions cannot go below 0). The following grid of density plots emphasizes that circumstance:
Besides the sentiment aggregates we also studied how the BPM performance indicator values were distributed for the different players. Similar to the unweighted and weighted sentiment averages before, BPM was also normally distributed, as the following grid of Q-Q plots indicates:
Knowing that the BPM values were normally distributed for the different players, it was sufficient to simply construct boxplots for the performance indicator to get a sense of how the individual players performed in general and how their performance fluctuated over the considered seasons.
After having observed the univariate distributions of the variables that were of importance for this analysis, we now wanted to assess whether there is a relationship between the individual sentiment aggregates and the BPM values for any of the individual players and sentiment lexicons.
We began by having a closer look at the relationship between the unweighted sentiment average and the BPM performance indicator. For that purpose we created a grid of scatterplots for each player and sentiment lexicon combination and fitted a simple linear regression line through each of the resulting point clouds. Additionally, to measure the strength and direction of a potential bivariate linear relationship, we made use of the ggpubr library by adding the corresponding Pearson correlation coefficient r and its associated p-value (using a t-test statistic with n-2 degrees of freedom) to each scatterplot. It should be noted that the Pearson correlation coefficient was applicable since both variables were normally distributed, as indicated before. Furthermore, we added the p-value to measure how significantly the corresponding Pearson correlation coefficient deviated from zero (no correlation / linear relationship). The resulting plot is represented below.
As one can see, the points of the different scatterplots appeared rather scattered, and for the different sentiment lexicons and players there was neither a strong nor a directly visible (linear) relationship between the average tweet sentiment and the BPM performance indicator. Even though some of the linear regression lines suggested a correlation, the correlations themselves were rather weak or even negligible, as indicated by the respective Pearson correlation coefficients r that were relatively small (mostly less than 0.1). Additionally, most of the p-values of the associated Pearson correlation coefficients were rather high, which suggested that the observed strengths of the correlations were not significantly different from 0 (and might have appeared due to random chance).
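The coefficient and p-value shown in each panel correspond to what base R’s cor.test reports; a sketch with simulated data mimicking such a weak, noisy relationship:

```r
set.seed(42)
avg_sentiment <- rnorm(50)
bpm <- 0.1 * avg_sentiment + rnorm(50, sd = 8)  # weak, noisy relationship

test <- cor.test(avg_sentiment, bpm)  # Pearson by default
test$estimate   # Pearson r
test$p.value    # p-value of the t-statistic with n - 2 df
test$parameter  # degrees of freedom: 48
```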
Nevertheless, there were also some counterexamples where the Pearson correlation coefficient appeared rather significant. The player Jaylen Brown, for example, showed a positive correlation for the Afinn lexicon with a p-value below 0.05. However, since the correlations were rather weak, not significant, and sometimes contradicted by other sentiment lexicons (note that the correlation was negative for the nrc lexicon), it is debatable whether the positive correlation is generalizable to the entire population or even holds for the single player alone.
Due to these reasons we had to conclude that there is no evidence of a significantly strong linear correlation between the average sentiment of tweets players receive within 24 hours before games and their performance within the games.
There was however another interesting observation the scatterplots revealed, namely the prominent outliers. For almost every player there was at least one game day in which the average tweet sentiment was vastly more positive compared to other days. Additionally there were some players with game days associated with an extremely negative average tweet sentiment in comparison to other days. To investigate these outliers more closely we created two word clouds for each player, one for the worst average tweet sentiment the player received and one for the best. We used the tweet sentiments created from the Jockers-Rinker lexicon for this purpose and mapped the 50 most frequent words that appeared in the corresponding tweets on each wordcloud.
The most prominent observation derived from the wordclouds was that the best average sentiments were frequently associated with the words “happy” and “birthday,” which indicated that these players were receiving birthday wishes on that same game day. Besides birthdays it appeared that other players received positive tweet sentiments due to another important day or event in their life. For Stephen Curry the terms “baby,” “congrats,” “boy,” “family,” “hands,” “blue” and “eyes” occurred rather frequently. By having a glimpse on the tweets he received that day, we can see people were congratulating him for another baby that was on the way:
For the worst sentiments, on the other hand, the picture was not that clear. Jamal Murray, for example, also frequently received the words “happy” and “birthday” on the game day with the worst average tweet sentiment. On closer inspection of the tweets themselves, however, we saw that he was only tagged in 22 tweets that day and that the birthday wishes were actually addressed to the player Dejuan Wagner, not to himself:
One noticeable observation was that swear words tended to appear more frequently in the associated tweets. E.g. with regard to the worst-sentiment game of Klay Thompson, the terms “shots,” “missed” and “bad,” in combination with several swear words, reflect a situation where the player received some hate due to missed shots in a previous game. The example tweets are displayed below:
Nevertheless it was hard to make a general conclusion why the players received such bad tweet sentiments on the associated days from the most frequent terms alone.
One might argue that a player who receives over a thousand tweets each day is not capable of reading or even noticing all the tweets he receives. Due to this we made the assumption that tweets with a higher retweet count also have a higher likelihood of being read. Considering this, we now used the retweet-count-weighted sentiments, which were already created before, for the computation of the mean sentiment and created a correlation plot like in the previous section.
Generally we can say that the weighted average sentiment also did not reveal any better correlation results. There were, however, some exceptions. For Josh Richardson the Pearson correlation coefficients appeared rather significant, with p-values less than 0.1 for the different sentiment lexica (except for the emoji lexicon by Novak). Despite the fact that these correlations were all positive for this player, they were still quite weak, with a maximum Pearson correlation coefficient of 0.29 for the Jockers-Rinker sentiment lexicon. That said, this is an exception and does not alter the general observation.
We created the same plot a third time, but now only considering the proportion of negative tweets a player received at each respective game. Since the distributions of the negative proportions were heavily right skewed and not normally distributed, we used the Kendall-Tau rank correlation coefficient instead of the Pearson metric.
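In base R this is again a one-liner; a toy example where more negative tweets go along with worse games:

```r
# Made-up negative-tweet proportions (right-skewed, with ties at 0)
neg_prop <- c(0, 0.05, 0.1, 0, 0.5, 0.02)
bpm      <- c(3, -2, 1, 5, -8, 0)

# Kendall's tau is rank-based, so the skew and the ties are unproblematic
cor(neg_prop, bpm, method = "kendall")
```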
Unfortunately, this aggregate delivered a similar picture regarding the correlation we wanted to inspect.
Finally, we investigated whether there is a time-dependent relationship between the average sentiment score and the BPM value the players received. For that purpose we plotted, as an example, the Jockers-Rinker average sentiments and the BPM values against the game dates on which they were observed for each player for the season 2018-19. Since both variables fluctuated quite strongly over the considered timespan, we overlaid a smoothed average line to improve interpretability. The resulting plot is displayed below.
As one can see, there also appeared to be no time-dependent association between the variables.
To conclude this section, we can wrap up our observations as follows:
One attempt to check whether the sentiments have an influence on the players’ performance at all, and thereby to scrutinize our findings from the correlation analysis, was to set up a random forest regression model to predict the BPM of players. The idea was to set up different models and compare the predictions with and without the sentiments as input features. We set up the following models:
In detail, Homegame expresses whether the player played a home game (1) or an away game (0). The Trend indicates the team’s performance over the last 5 games, i.e. the sum of wins (+1) and losses (-1) is calculated. In order to measure more long-term team performance we used the SRS (Simple Rating System), which assigns a score to each team according to their average point difference and strength of schedule, where 0 marks the average score. The SRS of the previous season was used for the prediction.
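The Trend feature can be sketched with base R (toy win/loss sequence; +1 win, -1 loss):

```r
results <- c(1, -1, 1, 1, -1, 1, 1, -1, 1, 1)

# Trend before game i: sum of up to five previous results
trend <- sapply(seq_along(results), function(i) {
  if (i == 1) return(0)                # no history before the first game
  sum(results[max(1, i - 5):(i - 1)])
})

trend  # 0 1 0 1 2 1 1 3 1 1
```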
We started by loading the tweets and extracting the necessary columns before transforming them.
After that, the SRS (Simple Rating System) for the previous season was extracted. This was done by parsing the HTML table from basketball-reference.com and using the team’s abbreviation to combine the data with the previous dataset.
Then some of the variables were renamed and the mean of the last 5 BPM values was calculated.
In the last pre-processing step, the already computed sentiments were merged with the newly created data.
For the actual prediction task we started by selecting the relevant columns for the model and split the data into training and testing data with a proportion of 75% training to 25% testing. A validation split of 20% of the training set was then performed in order to combat overfitting. Each model was created using the recipes package. Using tune_grid(), the model parameters mtry (number of predictors that will be randomly sampled at each split when creating the tree models), trees (number of trees contained in the ensemble) and min_n (minimum number of data points in a node that are required for the node to be split further) were tuned. After tuning, we chose the best model fit based on the RMSE (Root Mean Square Error), which measures the average distance between the predicted and the actual values, i.e. the lower the RMSE, the better the model fits the dataset. The importance score for each variable was saved as well.
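A sketch of the split and the tuned model specification (toy data frame; the real pipeline additionally builds a recipe and resamples for tune_grid, which is only indicated in comments):

```r
library(rsample)
set.seed(123)

# Toy stand-in for the modeling data frame
df <- data.frame(BPM = rnorm(100), sentiment = rnorm(100))

split <- initial_split(df, prop = 0.75)  # 75 % training / 25 % testing
train <- training(split)
test  <- testing(split)

# Random-forest spec with the three tuned parameters (not run here):
# library(parsnip); library(tune)
# rf_spec <- rand_forest(mtry = tune(), trees = tune(), min_n = tune()) |>
#   set_engine("ranger") |>
#   set_mode("regression")

c(nrow(train), nrow(test))  # 75 25
```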
In order to determine which model predicted the BPM performance best, a visualization is shown below that displays the predicted value (y-axis) against the true value (x-axis) for each model, meaning a perfectly fitted model would have all predictions on the dashed line. The visualization doesn’t allow for a clear interpretation, since the predictions of all models are scattered and no specific trend can be observed.
Therefore, to determine which model predicted the values best, the RMSE and RSQ (R-squared: the proportion of variance in the dependent variable that can be explained by the independent variables) were used to further analyze the different models. The bar plot shows that the RMSE values are in the range of 7.7049 to 8.1546. Using RMSE as a metric, the model without sentiment scores predicts the BPM best. The differences between the RSQ values of the models, on the other hand, are quite large. Still, none of the models explains the BPM well, with RSQ values ranging from 0.0187 to 0.1007. So even here, a direct influence of the sentiments as a feature cannot be derived from the model, underlining our finding from the correlation analysis that the sentiments have no significant influence on a player’s in-game performance.
As a last step we looked at the importance scores of each sentiment analyzer method for the model including sentiments. Here it can be observed that the syuzhet and jockers_rinker sentiment scores rank first and second. The (permutation) importance score is calculated by (1) measuring a baseline RSQ and (2) permuting the values of one feature and measuring the RSQ again. The importance score is the difference between the baseline and the RSQ after the permutation. The plot indicates a high influence of the sentiment scores, but since the model has a low RSQ, this should be interpreted cautiously. To further understand the influence of sentiment scores, a new model should be deployed in further research with the goal of increasing the RSQ, and the feature importance analyzed again.
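The permutation-importance idea can be reproduced by hand; the following uses an illustrative linear model on simulated data rather than the actual random forest:

```r
set.seed(1)
x <- rnorm(200)
y <- 2 * x + rnorm(200, sd = 0.5)
fit <- lm(y ~ x)

rsq <- function(obs, pred) cor(obs, pred)^2

baseline <- rsq(y, predict(fit))                                    # (1) baseline RSQ
x_perm   <- sample(x)                                               # (2) permute the feature
permuted <- rsq(y, predict(fit, newdata = data.frame(x = x_perm)))  #     RSQ after permutation

importance <- baseline - permuted  # large drop -> important feature
```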
At the end of our project it is time to dedicate a section to the final results and findings of the whole work. It turned out that the most time-consuming tasks were not only running the exploratory analyses on our data but also gathering and preprocessing the data itself. Especially the processing of the Twitter data held some unexpected challenges, namely the proper handling of emojis, which we solved with a dedicated emoji sentiment lexicon, and the overall handling of the Twitter API, which turned out to be more complex than expected. Nonetheless we were able to generate meaningful data that contains all variables needed to run some interesting analyses. Twitter in particular offered a wide range of metadata delivered with each tweet (e.g. the retweet count, which was vital for one of our correlation approaches).
With the exploratory data analysis we wanted to answer our initial research question. For this purpose we ran different approaches over the data to check whether there is a significant correlation between the average sentiment of tweets a player receives before a game and his in-game performance. Unfortunately, our hypothesis does not hold, and no such significant correlation could be observed.
To further investigate our findings, we wanted to evaluate the impact of the sentiments on a prediction model that predicts a player’s BPM performance score. The idea: if the sentiments contribute strongly to the prediction and this prediction is relatively good, it could be interpreted as an indicator that the significant correlation exists after all and we just made some mistakes in the analysis setup. And indeed, the majority of the sentiment scores contributed strongly to the prediction. But unfortunately the prediction was quite poor. This could be interpreted as the sentiments pushing the predictor in a wrong learning direction and therefore not being significantly correlated with the prediction outcome.
Reflecting on the overall project and its outcome, some considerations regarding the whole project setup can be made:

* For the Twitter data, domain-specific and more advanced sentiment extraction methods could be found, or existing analyzers could be tuned
* For the prediction model, the input variables should be reviewed to gain a better prediction outcome
* Generally, the correlation analysis could be decoupled from the BPM performance variable and applied to component variables of the BPM score (e.g. a correlation between the sentiments and the 3-point field goal percentage)
* Further assumptions could be incorporated into the process, like the fact that some players don’t manage their own Twitter accounts at all but let professional social media agencies monitor the activities. Such accounts are of course irrelevant for our analysis